ABSTRACT
A distributed system is a collection of independent entities that cooperate to solve a problem that cannot be
solved individually. Checkpointing is a fault-tolerance technique: the state of a process is saved during
failure-free execution so that, after a failure, the process can restart from the checkpointed state instead of
repeating the computation from the beginning, which reduces the amount of lost work. Restoring a process from a
previously checkpointed state is known as rollback recovery. A checkpoint can be saved on either stable storage or
volatile storage, depending on the failure scenarios to be tolerated. Checkpointing is a major challenge in mobile
ad hoc networks. A mobile ad hoc network consists of a set of self-configuring mobile hosts (MHs) that
communicate with each other without the assistance of base stations, with some of the processes running on the
mobile hosts. The main constraints of this environment are limited power and limited storage capacity. This paper
surveys the algorithms reported in the literature for checkpointing in distributed systems as well as in mobile
distributed systems.
Keywords: Checkpointing, Distributed systems, Fault tolerance, Mobile computing systems, Rollback recovery.